Statistical testing in a fourfold table follows the general process described in Chapter 12: You
formulate a null hypothesis (H0) about the fourfold table, set the significance level (such as
α = 0.05), calculate a test statistic, find the corresponding p value, and interpret the result. With a
fourfold table, one obvious test to use is the chi-square test (if necessary assumptions are met). The
chi-square test evaluates whether membership in a particular row is statistically significantly
associated with membership in a particular column. The p value from the chi-square test is the
probability that random fluctuations alone, in the absence of any real effect in the population, could
have produced an observed effect at least as large as what you saw in your sample. If the p value is
less than α (which is 0.05 in this scenario), the effect is said to be statistically significant, and the
null hypothesis is rejected. Assessing significance using a chi-square test is the most common
approach to testing a cross-tab of any size, including a fourfold table. But a fourfold table can also
serve as the basis for other metrics besides the chi-square test. These metrics are useful in other
ways and are discussed in this chapter.
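As a minimal sketch of the testing process just described, the following R code runs a chi-square test on a fourfold table. The cell counts and the row and column labels are hypothetical, invented only for illustration. The code uses the base R function chisq.test() with the Yates continuity correction turned off, which is one reasonable choice and not necessarily the one you would make for your own data.

```r
# Hypothetical cell counts, invented for illustration only
ctab <- matrix(c(30, 20,
                 10, 40),
               nrow = 2, byrow = TRUE,
               dimnames = list(Exposure = c("Exposed", "Unexposed"),
                               Outcome  = c("Present", "Absent")))

# Significance level chosen before looking at the data
alpha <- 0.05

# Pearson chi-square test; correct = FALSE turns off the Yates continuity correction
result <- chisq.test(ctab, correct = FALSE)

result$statistic  # chi-square test statistic
result$parameter  # degrees of freedom (1 for a fourfold table)
result$p.value    # p value

# TRUE means the association is statistically significant at alpha
significant <- result$p.value < alpha
significant
```

With these made-up counts, all expected cell counts are at least 20, so the usual large-sample assumption of the chi-square test is comfortably met.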
In the rest of this chapter, we describe many useful calculations that you can derive from the
cell counts in a fourfold table. The statistical software that cross-tabulates your raw data can
provide these indices, depending on the commands it has available (see Chapter 4 for a review
of statistical software). Thankfully, and unlike in most chapters in this
book, the formulas for many indices derived from fourfold tables are simple enough to do
manually with a calculator (or using Microsoft Excel). All you need are the counts (frequencies)
in each of the four cells. For these indices, you can also use an online calculator, which is
available at https://statpages.info/ctab2x2.html. This chapter demonstrates how to
calculate these indices in R (free, open-source software described in Chapter 4).
Like any other value you calculate from a sample, an index calculated from a fourfold table is a
sample statistic, which is an estimate of the corresponding population parameter. A good researcher
always wants to quote the precision of that estimate. In Chapter 10, we describe how to calculate the
standard error (SE) and confidence interval (CI) for sample statistics such as means and proportions.
Likewise, in this chapter, we show you how to calculate the SE and CI for the various indices you can
derive from a fourfold table.
Though an index itself may be easy to calculate manually, its SE or CI usually is not. Approximate
formulas are available for some of the more common indices. These formulas are usually based on the
fact that the random sampling fluctuations of an index (or its logarithm) are often nearly normally
distributed if the sample size is large enough. We provide approximate formulas for SEs where they’re
available, and demonstrate how to calculate them in R when possible.
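To illustrate the idea of a normal approximation on the logarithmic scale, here is a minimal sketch that uses the odds ratio as one example of an index. The cell counts are hypothetical, the names cell_a through cell_d are chosen only for this example, and the SE formula shown is the commonly used large-sample approximation for the log odds ratio. Other indices need their own formulas.

```r
# Hypothetical cell counts, invented for illustration only
cell_a <- 30  # row 1, column 1
cell_b <- 20  # row 1, column 2
cell_c <- 10  # row 2, column 1
cell_d <- 40  # row 2, column 2

# The index itself is easy to calculate by hand
odds_ratio <- (cell_a * cell_d) / (cell_b * cell_c)

# Approximate SE of the natural log of the odds ratio (large-sample formula)
se_log_or <- sqrt(1 / cell_a + 1 / cell_b + 1 / cell_c + 1 / cell_d)

# Critical value from the standard normal distribution for a 95% CI
z <- qnorm(0.975)

# Build the interval on the log scale, then convert back to the original scale
ci <- exp(log(odds_ratio) + c(-1, 1) * z * se_log_or)

odds_ratio
ci
```

Because the interval is built on the log scale and then exponentiated, it isn't symmetric around the odds ratio itself. The upper limit lies farther from the estimate than the lower limit does.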
For consistency, all the formulas in this chapter refer to the four cell counts of the fourfold
table, and the row totals, column totals, and grand total, in the same standard way (see
Figure 13-1). This convention is used in many online resources and textbooks.
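The following R sketch shows one way to store a fourfold table and pull out the four cell counts and the totals. The figure isn't reproduced here, so the labeling is an assumption: the cells are taken to be a and b across the top row and c and d across the bottom row, a layout that many textbooks use. The counts and the object names (cell_a, row_totals, and so on) are hypothetical.

```r
# Hypothetical cell counts, invented for illustration only
ctab <- matrix(c(30, 20,
                 10, 40),
               nrow = 2, byrow = TRUE,
               dimnames = list(Row    = c("Row 1", "Row 2"),
                               Column = c("Col 1", "Col 2")))

# The four cell counts (assumed layout: a, b in the top row; c, d in the bottom row)
cell_a <- ctab[1, 1]
cell_b <- ctab[1, 2]
cell_c <- ctab[2, 1]
cell_d <- ctab[2, 2]

# Row totals, column totals, and grand total
row_totals  <- rowSums(ctab)
col_totals  <- colSums(ctab)
grand_total <- sum(ctab)

# Display the table with its totals appended as a "Sum" row and column
full_table <- addmargins(as.table(ctab))
full_table
```

Keeping the cell counts in named objects like these makes it easier to compare later formulas with your own code.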